import requests
import pandas as pd, numpy as np
import json
import time
from bs4 import BeautifulSoup as soup
from matplotlib import pyplot as plt
import seaborn as sns
from PIL import Image
from io import BytesIO
data = pd.read_csv('twitter_archive_master.csv', parse_dates=['timestamp'])
This report presents insights from the wrangled dataset twitter_archive_master.csv by answering the questions outlined below.
About 98% of all tweets were sent from iPhones, with the remainder coming from the web client and TweetDeck.
data.source.value_counts()/data.source.shape[0]
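The same shares can be obtained with the `normalize` option of `value_counts`. The snippet below is a minimal sketch on a small made-up `source` column; the toy values are assumptions for illustration and are not taken from twitter_archive_master.csv.

```python
import pandas as pd

# Hypothetical toy data standing in for the real `source` column
toy = pd.DataFrame({"source": ["iPhone"] * 8 + ["Web Client", "TweetDeck"]})

# normalize=True returns each value's share of the non-null rows
shares = toy["source"].value_counts(normalize=True)
print(shares)
```

Note that `normalize=True` divides by the number of non-null rows, so the result differs from dividing by `shape[0]` only when the column contains missing values.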
Monday is the day with the most tweets.
data['day_name']=data.timestamp.dt.day_name()
data.groupby('day_name')['tweet_id'].count().plot(kind='bar');
plt.xlabel('Day of week');
plt.ylabel('Number of tweets');
plt.title('Distribution of tweets by day of week');
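A `groupby` on day names sorts the bars alphabetically, which can make the weekly pattern harder to read. One possible way to restore calendar order is to reindex the counts, sketched here on made-up timestamps (the toy frame and the `day_order` list are assumptions, not part of the original analysis).

```python
import pandas as pd

# Hypothetical timestamps; the real notebook uses data.timestamp
toy = pd.DataFrame({
    "tweet_id": [1, 2, 3, 4, 5],
    "timestamp": pd.to_datetime([
        "2017-01-02", "2017-01-02", "2017-01-04", "2017-01-08", "2017-01-06",
    ]),
})
toy["day_name"] = toy["timestamp"].dt.day_name()

day_order = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]

# groupby sorts keys alphabetically; reindex restores calendar order
# and fills days that have no tweets with 0
counts = (toy.groupby("day_name")["tweet_id"].count()
             .reindex(day_order, fill_value=0))
print(counts)
```

Calling `counts.plot(kind='bar')` on the reindexed series would then draw the bars from Monday to Sunday.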
Zooming in on Monday, the day with the most tweets, the breed predicted most often by the neural network is golden_retriever.
fav_day_dog = data.query('day_name=="Monday"')
fav_day_dog.groupby('first_class_prediction')['tweet_id'].count().nlargest(10).plot(kind='bar');
plt.xlabel('Breed');
plt.ylabel('Number of tweets');
plt.title('Distribution of dogs breeds on the day of most tweets');
Looking at the dog stages of golden retrievers, the most tweeted breed on the busiest tweeting day, most of them are in the pupper stage.
data.query('first_class_prediction=="golden retriever"').stage.value_counts().plot(kind='bar');
plt.xlabel('Stage');
plt.ylabel('Number');
plt.title('Distribution of stage for the most popular dog(golden retriever)');
In an effort to identify the most favorited dogs on WeRateDogs, the 3 dogs with the highest favorite_count are presented as the celebrities of the dog world. It is no surprise that a handsome doggo Labrador retriever taking a swim holds the crown as the most favorited dog, followed by a puppa Lakeland terrier out on a walk. Third place goes to a Chihuahua. And what is a Chihuahua without some mischief? This cute one is chewing on a broom.
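# Sort tweets from most to least favorited
fav = data.sort_values(by='favorite_count', ascending=False)

def favorite(rank):
    """Print the stage and predicted breed of the tweet at `rank`, then return its image."""
    row = fav.iloc[rank]
    print('Stage: {}\nBreed: {}'.format(row['stage'], row['first_class_prediction']))
    # Download the tweet's image and open it from memory
    r = requests.get(row['jpg_url'])
    return Image.open(BytesIO(r.content))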
fav[['favorite_count','first_class_prediction']].head(3)
favorite(0)
favorite(1)
favorite(2)
Surprisingly, the three dogs with the lowest favorite counts are an English setter, a miniature pinscher and a curly-coated retriever. Still, I dare say they deserve to be up there with the much-loved Chihuahua.
fav[['favorite_count','first_class_prediction']].tail(3)
favorite(-1)
favorite(-2)
favorite(-3)
The scatter plot below shows a linear relationship between favorite_count and retweet_count. The two are highly correlated, with a correlation coefficient of 0.98, so it is fair to say that people tend to retweet the tweets they like.
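print(data[['favorite_count','retweet_count']].corr());
# DataFrame.plot creates its own figure, so the size is passed here
# instead of through a separate plt.figure call
data.plot('favorite_count','retweet_count',kind='scatter',figsize=(10,10));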
plt.xlabel('Favorite Counts');
plt.ylabel('Retweet Counts');
plt.title('Retweets Vs favorites');
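For readers who want to see what the coefficient printed by `corr()` measures, the sketch below computes the Pearson correlation in two ways on made-up counts: once through pandas and once from the definition. The toy numbers are assumptions for illustration only and do not come from the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical counts, not taken from twitter_archive_master.csv
toy = pd.DataFrame({
    "favorite_count": [100, 250, 400, 800, 1600],
    "retweet_count": [30, 80, 120, 260, 500],
})

# Series.corr() defaults to the Pearson coefficient
r_pandas = toy["favorite_count"].corr(toy["retweet_count"])

# Same quantity from the definition:
# sum of co-deviations divided by the product of the deviation norms
x = toy["favorite_count"].to_numpy(dtype=float)
y = toy["retweet_count"].to_numpy(dtype=float)
dx, dy = x - x.mean(), y - y.mean()
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

print(round(r_pandas, 4), round(r_manual, 4))
```

A value close to 1 indicates that the points lie near an upward-sloping straight line; it describes association and does not by itself show that favoriting causes retweeting.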
Twitter is home to a vast and diverse community of users, and one thing that many of them have in common is their love of dogs. In fact, our analysis of WeRateDogs tweets showed that a whopping 98% of them were sent from iPhones, with the rest coming from the web client and TweetDeck.
But when are these dog-related tweets being sent? It turns out that Monday is the most popular day for tweeting about our furry friends. And when we take a closer look at tweets from that day, the breed that shows up most often is the golden retriever.
But it's not just any golden retrievers that are getting all the attention. Our analysis found that most of the golden retrievers tweeted about were in the "pupper" stage. In other words, they were still young and adorable.
So which dogs are the most popular on Twitter? To find out, we looked at the dogs with the highest favorite count on the platform. It should come as no surprise that the top dog is a handsome Labrador retriever taking a swim, followed by a puppa Lakeland terrier out for a walk. In third place is a mischievous Chihuahua chewing on a broom.
On the other end of the spectrum, the dogs with the lowest favorite count were an English setter, a miniature pinscher, and a curly-coated retriever. But let's not forget that all dogs are deserving of love and affection, no matter their popularity on social media.
One interesting thing to note is the relationship between a tweet's favorite count and its retweet count. Our analysis found a strong linear relationship between the two, with a correlation coefficient of 0.98. This suggests that people tend to retweet the tweets they also favorite. So if you want your dog tweet to go viral, make sure it's both adorable and worthy of a "favorite" from your followers.